Helquat dyes targeting G-quadruplexes as a new class of anti-HIV-1 inhibitors

The secondary structure of nucleic acids containing quartets of guanines, termed G-quadruplexes, is known to regulate the transcription of many genes. Several G-quadruplexes can be formed in the HIV-1 long terminal repeat promoter region and their stabilization results in the inhibition of HIV-1 replication. Here, we identified helquat-based compounds as a new class of anti-HIV-1 inhibitors that inhibit HIV-1 replication at the stage of reverse transcription and provirus expression. Using Taq polymerase stop and FRET melting assays, we have demonstrated their ability to stabilize G-quadruplexes in the HIV-1 long-terminal repeat sequence. Moreover, these compounds were not binding to the general G-rich region, but rather to G-quadruplex-forming regions. Finally, docking and molecular dynamics calculations indicate that the structure of the helquat core greatly affects the binding mode to the individual G-quadruplexes. Our findings can provide useful information for the further rational design of inhibitors targeting G-quadruplexes in HIV-1.

Helquats inhibit different stages of the HIV-1 life cycle. It has been reported that compounds with multiple positive charges can nonspecifically block viral entry 47. Because the compounds that we tested are protonated under physiological conditions 48, their antiviral effect could result from virion damage or from reduced virion binding and entry into target cells. To investigate this possibility, we performed adsorption and replication assays.
To address virion binding and entry in a competitive setting, we performed an adsorption assay in which test compounds and virions were added to MT-4 cells simultaneously and washed out after 2 h of incubation. In parallel, we performed a replication assay in which MT-4 cells were infected with HIV-1 and the tested compounds were added 1 h later. In both cases, the infection was evaluated by measuring reverse transcriptase activity in the cell-free supernatant 5 days after infection (Fig. 3). As controls, PBS (phosphate-buffered saline) and inhibitors of different HIV-1 life-cycle steps were used: AMD (AMD3100), an inhibitor of gp120 attachment to the CXCR4 receptors of the target cell, which thus prevents HIV-1 adsorption and entry; AZT (azidothymidine), an inhibitor of reverse transcriptase; and RAL (raltegravir), an integrase inhibitor. Some compounds (namely 1, 5, 10, 13, 16, 17, 18 and 19) inhibited neither adsorption nor replication, or their effect was weak, whereas other compounds (2, 4, 8, 9 and 11) inhibited adsorption. Compounds compromising viral entry were excluded from further analyses because their effect is probably nonspecific, mainly because of their cationic nature. The results of these two experiments led to the selection of compounds 3, 6, 7, 12, 14 and 15, which had minimal or no effect on virus adsorption while significantly impairing viral replication.
... points, but the initial time point was 2 h post-infection to eliminate their virucidal/adsorption/entry effects. Several inhibitors of the individual steps of the HIV-1 life cycle were used as controls to resolve the post-entry steps of the HIV-1 life cycle. MT-4 cells were infected with NL4-3 HIV-1 and the infection was allowed to proceed for 1 h. Subsequently, the unbound virions were washed out, and the cells were resuspended in fresh media. The compounds were first added 2 h after the infection.
The culture media were harvested 31 h later and the number of released virions was monitored through the activity of reverse transcriptase in the culture media (Fig. 4). All the compounds presented in Fig. 4, except for 7, which was excluded from further experiments because of the high variability of its results, were active when added at early time points of the experiment (i.e. 2 and 3 h post-infection), but only some of them (12 and 14) preserved their activity at later time points. In comparison with the control compounds (AZT, an inhibitor of reverse transcriptase, and RAL, an inhibitor of integrase), these results suggest that the effect of 3, 6 and 15 is mainly manifested when these compounds are administered before or early during reverse transcription; therefore, these compounds are likely to act already at the RNA level. By contrast, the second group of compounds (12 and 14) can be added later (up to 8 h), when reverse transcription is mostly completed, indicating that these compounds might also influence the G-quadruplexes at the DNA level.
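The triage described above can be restated as simple set operations. The following is a minimal sketch for bookkeeping only: the compound numbers come from the text, whereas the variable names and the set-based layout are illustrative assumptions and not part of the original analysis.

```python
# Compound numbers are taken from the text; the names and structure
# below are illustrative assumptions, not the authors' own workflow.
ALL_COMPOUNDS = set(range(1, 20))  # the 19 compounds selected for the study

# Outcome of the adsorption and replication assays (Fig. 3 in the text)
weak_or_inactive = {1, 5, 10, 13, 16, 17, 18, 19}
adsorption_inhibitors = {2, 4, 8, 9, 11}

# Compounds carried forward: little effect on adsorption,
# clear effect on replication
selected = ALL_COMPOUNDS - weak_or_inactive - adsorption_inhibitors

# Outcome of the time-of-addition experiment (Fig. 4 in the text)
excluded_for_variability = {7}
active_when_added_late = {12, 14}      # still active up to 8 h
active_only_when_added_early = (
    selected - excluded_for_variability - active_when_added_late
)

print(sorted(selected))                      # [3, 6, 7, 12, 14, 15]
print(sorted(active_only_when_added_early))  # [3, 6, 15]
```

The three assay outcomes partition the 19 compounds without overlap, which is a quick consistency check on the compound lists given in the text.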

Helquats regulate G-quadruplexes in
Helquats stabilize G-quadruplexes in the HIV-1 LTR promoter. Perrone et al. 11 have reported that the conserved region of the HIV-1 promoter spanning the NF-κB and Sp1 sites contains G-tracts involved in the formation of mutually exclusive G4 structures. Furthermore, they have shown that these G4s are stabilized by quadruplex-binding ligands, such as TMPyP4 or BRACO-19, and that these ligands induce the formation of an additional G4. The aforementioned experiments have indicated that active compounds might modify the stability of HIV-1 LTR G4s. We have used a construct corresponding to the G-rich part of the core HIV-1 promoter (positions −105/−48 with respect to the transcription initiation site) with FAM and TAMRA moieties placed at its 5′- and 3′-ends, respectively. The stability of the G4s formed in the presence or absence of G4-binding ligands has been assessed by melting-curve experiments monitored through the increase in FAM fluorescence (representative FRET melting curves are shown in Supplementary Information Fig. S1).
The results show that all the compounds tested at the concentration of 2 μM stabilize G-quadruplexes in the HIV-1 LTR (0.2 μM concentration), as demonstrated by an increase in melting temperature (T m ), albeit to a different extent (ΔT m of 10-28 °C, Table 1). This experiment has confirmed that the compounds tested are able to stabilize G-quadruplexes in vitro and indicates that the observed HIV-1 inhibitory effect of the tested compounds might be caused by their stabilization of G-quadruplexes in the HIV-1 LTR region.
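The ΔTm values reported here are differences between averaged melting temperatures. As a minimal sketch of that arithmetic, the snippet below averages triplicate Tm readings, computes the standard error and subtracts the buffer control. The numeric readings are hypothetical placeholders chosen only to illustrate the calculation; they are not measured data from this study, and the function names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return the mean and standard error of the mean of replicate readings."""
    return mean(values), stdev(values) / sqrt(len(values))


def delta_tm(compound_tms, control_tms):
    """Delta Tm = average Tm with compound minus average Tm of the control."""
    return mean(compound_tms) - mean(control_tms)


# Hypothetical triplicate Tm readings in deg C (placeholders, NOT measured data)
control_tms = [60.0, 61.0, 59.0]    # buffer-only control
compound_tms = [80.0, 82.0, 81.0]   # oligonucleotide plus test compound

control_mean, control_sem = mean_and_sem(control_tms)
compound_mean, compound_sem = mean_and_sem(compound_tms)

print(f"control  Tm = {control_mean:.1f} +/- {control_sem:.2f}")
print(f"compound Tm = {compound_mean:.1f} +/- {compound_sem:.2f}")
print(f"delta Tm    = {delta_tm(compound_tms, control_tms):.1f}")
```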
Helquats bind mostly to the 6 and 6′ G-patches of the HIV-1 promoter. To identify the positions in the HIV-1 promoter region that are affected by the binding of G4-stabilizing agents, we performed a Taq polymerase stop assay. The wild-type LTR oligonucleotide was extended at the 5′ end to include a primer-annealing region and used as a template for a single-cycle Taq polymerase reaction. The elongation of the radioactively labelled primer was performed at 45 °C in the presence or absence of the tested compounds and controls. Most of the tested compounds (16 of the 19 compounds originally selected for this study) exhibit some degree of G-quadruplex stabilization at a concentration of 2.5 µM, as manifested by an increase in the intensity of the bands in the individual G-rich patches, although the extent of the observed effect differs.
The majority of the active candidates selected from the previous experiments (Fig. 5A; the entire gels for all tested compounds can be found in Supplementary Fig. S2) stabilize mostly the region corresponding to G-patch 6. Compounds 3, 6 and 15 also show strong stop bands in G-patch 6′. Some level of stabilization was observed for G-patch 5 as well, especially after incubation with 3 and 6 and partially also with 15 (Fig. 5B, quantification of G-patch regions 5, 6 and 6′). Interestingly, the tested helquats exhibit different patterns, indicating the involvement of different bases/G-tracts in the stabilized G-quadruplexes.
To evaluate the potency of G-quadruplex-stabilizing compounds, we performed this assay at different compound concentrations (5, 2.5 and 1 μM; Supplementary Fig. S2). This experiment revealed the concentration dependency of the stabilization effect of helquats. The strongest pausing of Taq polymerase was observed for 3, 6, 15 and partially 12, which exhibit their stabilization effect even at 1 μM concentration (Fig. 5C).
Helquats bind specifically to G-quadruplexes, not only to G-rich regions. There are many small molecular probes for specific G4 visualization in cells 49 . Helquats have been shown to have certain optical properties that make them attractive for use as target-specific fluorescence light-up probes for the recognition of targets such as dsDNA or AT-rich DNA sequences 45 . Therefore, we speculated that the most potent helquats presented in our study might show fluorescence light-up in the presence of G-quadruplexes. To distinguish specifically the potential light-up triggered by helquat binding to G-quadruplexes in the HIV-1 LTR, we employed two control oligonucleotides, both unable to form G-quadruplexes (Fig. 6A): an LTR oligonucleotide sequence with two point mutations precluding G4 formation and an LTR oligonucleotide with a scrambled sequence (both adopted from 41 ).
Among the compounds tested, only 12 exhibits strong and specific fluorescence after incubation with a quadruplex-forming oligonucleotide (Fig. 6B). The intensity of fluorescence measured after incubation with the two oligonucleotides incapable of G-quadruplex formation was almost identical, with the maximum intensity 2.5 times lower than that of the G-quadruplex-forming oligonucleotide (wt). This indicates that 12 does not bind non-specifically to G-rich regions but requires and selectively recognizes G-quadruplex formation. To examine the specificity of ligand 12, we have repeated the light-up experiment with single-stranded (ss) G4 LTR in parallel with double-stranded (ds) G4 LTR and additional G-quadruplexes (c-myc, c-kit1 and h-telo). We have observed that ligand 12 exhibits strong fluorescence after incubation with the quadruplex-forming oligonucleotide ss G4 LTR and only very weak fluorescence after incubation with ds G4 LTR (Supplementary Fig. S3A). The incubation of ligand 12 with other G-quadruplexes showed strong fluorescence in the case of the c-myc and c-kit1 oligonucleotides and low fluorescence in the case of the h-telo oligonucleotide (Supplementary Fig. S3A). This suggests that ligand 12 is not specific to HIV LTR quadruplexes but can also bind to other G-quadruplexes.
Table 1. FRET analysis of G-quadruplex melting temperatures in the presence of tested compounds at a concentration of 2 μM. PDS (pyridostatin): positive control; PBS (buffer): negative control. The ΔTm was calculated by subtracting the average Tm of PBS (control) from the average Tm for each compound. The Tm results were averaged from three independent experiments and are given as mean and standard error.
To further evaluate the specificity of ligand 12, we performed a light-up titration experiment in which compound 12 was incubated with serial dilutions (1:1) of oligonucleotides (ss G4 LTR wild type, G4 c-myc and c-kit1) spanning final concentrations from 2 to 0.002 μM (Supplementary Fig. S3B). Here we observed that at oligonucleotide concentrations of 0.5 μM and below, compound 12 exhibits stronger fluorescence after incubation with ss G4 LTR than after incubation with c-myc and c-kit1. This may indicate a slight preference of ligand 12 for ss G4 LTR over c-myc at lower oligonucleotide concentrations.
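The concentration range of the titration follows directly from the dilution scheme. As a small sketch, assuming that "1:1" denotes two-fold dilution steps, ten successive halvings of 2 μM give about 0.002 μM, i.e. an 11-point series. The number of points is inferred from this arithmetic and is not stated in the text; the function name is illustrative.

```python
def two_fold_series(start_um, n_points):
    """Concentrations (in uM) of a two-fold (1:1) serial dilution."""
    return [start_um / 2**i for i in range(n_points)]


# Assumption: 11 points, i.e. ten halvings, are needed to go from 2 uM
# down to roughly 0.002 uM; the point count is inferred, not reported.
series = two_fold_series(2.0, 11)

for conc in series:
    print(f"{conc:.4f} uM")

# Concentrations at or below 0.5 uM, the range in which the text reports
# stronger fluorescence with ss G4 LTR than with c-myc or c-kit1
low_range = [c for c in series if c <= 0.5]
```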
Different classes of compounds have distinct modes of binding to G-quadruplexes. It has already been shown that the promoter region of HIV-1 can form G4s of alternative configurations, which involve different G-patches and can even be induced by G4-binding ligands 11. The structures of two predominant HIV-1 LTR G4s (LTR-III, comprising G-patches 3-6, and LTR-IV, comprising G-patches 4-6′) have been characterized earlier by NMR spectroscopy 38,50 and are shown in Fig. 7A. The two G4s are mutually exclusive and show highly distinctive folding and features. Based on the structure of the helquat 'core', all the tested compounds can be categorized into two groups: HQ [6,6] dyes (3, 6 and 15) and HQ [7,7] dyes (12 and 14). In accordance with the results described above, we selected one representative of each group, namely 6 and 12 (because they are not cytotoxic and also have different binding patterns to HIV-1 LTR G-patches), and used computational methods to provide a rationale for their interaction with LTR-III and LTR-IV. Both enantiomers (M- and P-) of 6 and 12 have been docked into ten conformations of each of the available NMR structures of HIV-1 LTR-III and LTR-IV 38,50. To assess the stability of the docked poses, we ran 1 µs of molecular dynamics (MD) in explicit solvent. Most of the poses of 6 and 12 docked to LTR-III and LTR-IV were not stable: the ligands unbound and rebound on either side of the G-quartet (Supplementary Fig. S4), where they stayed for hundreds of ns. Those poses that were stable in MD bound at the junction of the G-quartet and the stem-loop of LTR-III and stacked on the bottom of the G-quartet of LTR-IV (Fig. 7B). These calculations indicate that the structures and the flexibilities of the individual G4s strongly affect the preferred binding mode of the helquats.
We have observed larger flexibility of the loops in the case of LTR-III (average RMSD of DNA of 6.8 Å) than in LTR-IV (3.6 Å). This behaviour is inherent in the structures of the LTRs (average RMSDs of 6.3 Å and 3.5 Å, respectively, from unliganded LTR MD). It should be noted that the flexibility is attributed to the loops whereas the G-quadruplexes are stable.
The calculations of interaction energy using MM-GBSA have predicted the order of affinity (from the strongest to the weakest) as LTR-III/6 > LTR-III/12 > LTR-IV/12 > LTR-IV/6. This indicates that: (i) both dyes prefer the LTR-III conformation of G4 over the LTR-IV conformation, (ii) compound 6 binds to LTR-IV much more weakly than compound 12, and (iii) the affinity differences between the M- and P-stereoisomers are larger for LTR-III (7.6 and 8.2 kcal/mol for 6 and 12, respectively, in favour of the M isomer) than for LTR-IV (0.6 and 5.3 kcal/mol in favour of 6/M and 12/P, respectively). These calculations indicate that the structure of the helquat core strongly affects the binding mode to the individual G4s.

Discussion
More than 500 helquat-based compounds were synthesized and subsequently screened for their anti-HIV-1 activity. On the basis of this activity, 19 compounds with EC50 < 4 µM were selected, and their mechanism of action was then evaluated. After the exclusion of compounds that targeted virus attachment or entry, the timing of the action was determined for the remaining candidates. It was revealed that in order for some compounds (3, 6 and 15) to be active, they had to be administered within 5 h after infection, which corresponded to the time frame of the initial phase of reverse transcription. This implies that they might act via binding to G4s at the RNA level and interfere with the course of reverse transcription. A similar activity time frame was previously published for the G4 ligand BRACO-19, which inhibited the reverse transcription process at the template level 40. G4s have multiple roles during HIV-1 reverse transcription. It is known that the HIV-1 genome consists of two single-stranded RNA molecules and that both RNA strands are required for reverse transcription to be successfully completed. HIV-1 RNA dimerises directly without the need for a protein co-factor, and it has been described that this interaction is driven by the formation of an interstrand G4 51. Furthermore, it is known that the G4 regions of the HIV-1 RNA genome form bi-molecular G4 structures corresponding to HIV-1 recombination hotspots 36. Other tested compounds (12 and 14) could be administered later (up to 8 h post-infection) and still retain some level of antiviral activity, a time frame sufficient for the completion of reverse transcription. Consequently, they might also exert their activity at the DNA level, either before or after integration, but this does not exclude the possibility that they act at the RNA level as well.
It should be noted that the reference compounds in the time-of-addition (TOA) assay are added at 100-fold their EC50 values, but our compounds were not selective enough to be used at such concentrations (their selectivity indices ranged between 11.7 and 29.9). It is therefore remarkable that we observed such a clear effect even at concentrations only five times greater than their EC50 values. Based on the antiviral activity of the G4-binding ligands and its timing, it can be assumed that during viral infection, the G4 structure is formed both in the RNA genome and in the promoter region of the HIV-1 provirus and that it presumably has important biological roles in both environments.
The G-cluster in the HIV-1 promoter has previously been shown to be a fine modulator of HIV-1 transcription. Several studies have shown that LTR G4s act as repressor elements of viral transcription, the effect being augmented by G4 ligands 36,40,41. G-patches 3″-6′ are conserved across HIV-1 sequences, with slight differences in the number of G residues (two to four), in the length of linkers (two to five nucleotides) and in their base composition. Interestingly, the common feature shared by all of these sequences is their ability to form stable G4 structures 37. This observation further supports the assumption that the G4 structure is an important element of the HIV-1 promoter region that is maintained by strong evolutionary pressure 39.
It has already been shown that the promoter region of HIV-1 can form alternative G4s with different G-patches involved and that additional conformations of G4 can even be induced by G4-binding ligands 11 or proteins 52. The structures of two predominant HIV-1 LTR G4s (LTR-III, comprising G-patches 3-6, and LTR-IV, consisting of G-patches 4-6′) were characterized by NMR spectroscopy 38,50. The two G4s are mutually exclusive and show very distinctive folding (Fig. 7). Furthermore, their effects on HIV-1 transcription are opposite: the stabilization of LTR-III inhibits viral transcription and thus might promote latency 11, whereas the formation of LTR-IV increases transcriptional activity and might result in provirus reactivation 50. These observations have led to the assumption that the balance of LTR-III and LTR-IV may act as a fine regulator of HIV-1 promoter activity. Indeed, we have observed different stop patterns after the treatment of the LTR template with the compounds studied. All compounds belonging to the HQ [6,6] group of dyes (3, 6 and 15) have shown a similar pattern, with strong stop bands at G-patches 6 and 6′ and weaker Taq polymerase pausing also before G-patch 5. By contrast, the representatives of the HQ [7,7] group of dyes (12 and 14) have produced stop bands only in G-patch 6.
We tried to link these results to LTR structures, dynamics and compound binding. Using molecular dynamics, we confirmed the structural stability of both the LTR-III and LTR-IV G4s and the greater flexibility of the loops of the former. We obtained stable binding modes of two selected representatives (6 from the HQ [6,6] group of dyes and 12 from the HQ [7,7] group of dyes) in both stereoisomeric forms. They were characterized by stacking on one side of the G4s coupled with interactions with the loops. These are typical modes of binding to G4s and have been observed for many ligands by X-ray and NMR methods 53. The predicted affinities, expressed as binding free energies, were calculated by the MM-GBSA method as the sum of interaction and deformation free energies (see "Materials and methods"). These numbers should be interpreted with great caution because of inaccuracies in nucleic-acid and ligand force fields, incomplete sampling and approximations in the MD (e.g. the lack of polarization) and MM-GBSA (e.g. implicit solvent) models. Indeed, not even the related, more advanced MM-PBSA method can quantitatively describe G4 conformations from 2.5-µs-long MD simulations 54. Moreover, entropy effects beyond the scope of the current study, such as the effect of two isoenergetic binding sites on G4 or M/P stereoisomer interconversion in solution, can further contribute to the binding of 6. These challenges in G4-specific ligand design warrant a multidisciplinary approach involving biology, chemistry and computations.
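For orientation, a decomposition of the kind described above can be written schematically as follows. This is only a sketch under the usual MM-GBSA conventions; the symbols are assumptions introduced here for illustration, and the exact definitions used in the study are those of Eqs. (1-4) cited in "Materials and methods", which may differ in detail.

```latex
% Schematic only: symbols are illustrative assumptions, not the paper's own equations.
\begin{align}
\Delta G_{\mathrm{bind}} &\approx \Delta G_{\mathrm{int}} + \Delta G_{\mathrm{def}} \\
\Delta G_{\mathrm{int}} &= G_{\mathrm{complex}} - G_{\mathrm{G4}}^{\mathrm{bound}} - G_{\mathrm{ligand}}^{\mathrm{bound}} \\
\Delta G_{\mathrm{def}} &= \left(G_{\mathrm{G4}}^{\mathrm{bound}} - G_{\mathrm{G4}}^{\mathrm{free}}\right)
  + \left(G_{\mathrm{ligand}}^{\mathrm{bound}} - G_{\mathrm{ligand}}^{\mathrm{free}}\right) \\
G &= E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}}
\end{align}
```

Here the "bound" superscript denotes a species evaluated in the geometry it adopts in the complex and "free" denotes the same species from its own simulation; each G combines the molecular-mechanics energy with generalized-Born and surface-area solvation terms.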
The regulatory mechanisms of G4s involve not only steric hindrance to transcription and/or translation, but also the binding of protein factors that modulate G4 conformation and serve as a scaffold for the recruitment of additional protein regulators. Cellular proteins regulate viral latency and effective transcription by inducing or unfolding HIV-1 LTR G4s 41,52,55. The G4-forming sequence in the HIV-1 LTR overlaps with three Sp1 as well as two NF-κB binding sites and is crucial for transcription initiation 11. Sp1 has been shown to bind not only the putative 5′-GGGCGG-3′ sequence in double-stranded DNA, but also the G4 secondary structure 56, to which it binds with even stronger affinity 57. Sp1 has an ambiguous role in the HIV-1 life cycle: it is a strong activator of provirus expression, but it is also associated with maintaining virus latency 58. The intriguing fact that the HIV-1 LTR possesses two modes of Sp1 binding (G4 structure and consensus sequence) opens new questions about G4 folding and subsequently the role of Sp1 in regulating provirus transcription and HIV-1 latency. Such diversity indicates that HIV-1 uses the G4-forming ability of its promoter region as a complex switching mechanism involving subtle Sp1 interaction differences to fine-tune the transcriptional output. This carefully balanced checkpoint of the HIV-1 life cycle can be exploited as a new target in antiretroviral therapy. In this study, we present a group of chemically similar compounds in which subtle modifications of the helquat core lead to a different mode of binding to the G4 region of the HIV-1 promoter. Because of their strong positive charge, these compounds are poor candidates for antiviral agents at this stage, but they might still serve as lead compounds for further rational drug design.
Our experiments show that the presented helquat dyes bind G4s selectively, although not exclusively those of the HIV-1 LTR; they can also bind other G4s, e.g. c-myc (as documented by the light-up experiment with 12). There is great structural variability among G-quadruplexes, which differ in the composition and length of their loops and stems 59,60. It is therefore difficult, if not impossible, to find structural reasons why any ligand, such as a helquat, would not bind to other types of G-quadruplexes. On the other hand, absolute specificity for HIV-1 G4s might not be necessary to achieve antiviral activity 19. Each infected cell produces many viral genomes; for example, the estimated burst size (the number of virions released from an infected cell) for HIV is between 1000 and 3000 virions per infected cell. Thus, we can assume that HIV G4s outnumber the host-cell G4s. Finally, these helquat dyes are also able to distinguish between distinct G4 conformations (as documented by the different affinities of 6, and to a lesser extent also of 12, for the LTR-III and LTR-IV quadruplexes). Such diversity could be exploited to design specific ligands able to selectively bind the different G4s formed in the HIV-1 LTR and thus specifically regulate the HIV-1 life cycle. Furthermore, these compounds might be a useful tool for determining how and when G4s regulate HIV-1 transcription, which would enhance our understanding of HIV-1 latency and reactivation.

Materials and methods
Oligonucleotides, drugs, plasmids and cells. All oligonucleotides were purchased from Sigma-Aldrich (Supplementary Table S1).
FRET melting assay. The G4 LTR FRET oligonucleotide was diluted to 0.4 μM in LiCaco buffer (10 mM lithium cacodylate at pH 7.2 and 100 mM KCl), heat-denatured at 95 °C for 5 min, and folded into a G4 structure at room temperature for 4 h. After incubation, the LiCaco buffer was added alone or with the compounds being tested (4 μM in LiCaco buffer and 1% DMSO). The final concentrations for compound screening were: 0.2 μM oligonucleotide, 2 μM tested compound, 100 mM KCl and 0.5% DMSO. Fluorescence melting curves were determined using a Realplex 4 Mastercycler (Eppendorf). After the first equilibration step at 25 °C for 5 min, the temperature was increased stepwise with a resolution of 0.1 °C up to 95 °C, with FAM emission monitored in each step. Tm was either calculated directly by the Realplex 4 Mastercycler, or raw data were analysed in GraphPad Prism 7, where the asymmetric (five-parameter) model was used to obtain melting temperatures Tm (°C) for the further calculation of ΔTm. The ΔTm was calculated by subtracting the average Tm of PBS (control) from the average Tm for each compound. Three independent experiments were performed.
Taq polymerase stop assay. A Taq polymerase stop assay was carried out as described in a previous study 11. Briefly, the primer (G4 LTR primer, Supplementary ...
Light-up experiments. The selected compounds were incubated in the presence of the G4 LTR wild-type (wt) oligonucleotide, the G4 LTR M4 + M5 oligonucleotide containing mutations that disabled G-quadruplex formation (described previously 41), the G4 LTR scrambled-sequence oligonucleotide, or the buffer-only control. The compounds were tested at a compound:oligonucleotide ratio of 2:1 (4 μM compounds and 2 μM oligonucleotide) in PBS in a 384-well plate. Emission spectra were measured after excitation at 280 nm in a Tecan Infinite M1000 plate reader. The fluorescent image of the plate was taken with the Bio-Rad ChemiDoc imaging system after UV-light excitation. The light-up titration experiment was performed as described above, except that 4 μM compound 12 was incubated with serial dilutions (1:1) of oligonucleotides (ss G4 LTR wild type, G4 c-myc and G4 c-kit1) spanning final concentrations from 2 to 0.002 μM. Excitation at 280 nm and emission at 600 nm were used to monitor the light-up.
Computational modelling. Both P- and M-enantiomers of compounds 6 and 12 were docked to all 10 + 10 conformations of the available NMR structures of G4s (LTR-III: 6H1K, LTR-IV: 2N4Y) using Glide SP 62. Based on visual inspection and the docking score, 49 of the resulting complexes were selected for further evaluation. To assess the stability of the binding modes, 1 µs molecular dynamics (MD) simulations were run in explicit solvent for the complexes under study, the free ligands and the G4s. The ligands were parametrized with the antechamber module of AMBER (AmberTools20; force field: GAFF2; Amber 2020, University of California, San Francisco) with RESP charges (HF/6-31G*) calculated using the Gaussian 16 package (Gaussian 16, revision A.03, Gaussian, Inc., Wallingford CT, 2016). The systems were prepared in LEaP (DNA force field: OL15, water model: TIP3P) and neutralized with Na+, and the NaCl concentration was adjusted to 0.15 M. Hydrogen mass repartitioning was applied to make it possible to use a time step of 4 fs. The systems were simulated using the GPU-accelerated pmemd code in AMBER20 in the following steps: 1000 cycles of minimization, 50 ps of heating to 300 K with the solute restrained (force constant: 2.0 kcal/mol/Å²), 200 ps of NpT equilibration with the solute restrained (force constant: 2.0 kcal/mol/Å²), 500 ps of NpT equilibration with the solute restrained (force constant: 0.1 kcal/mol/Å²) and 1 µs of production without restraints, using a time step of 4 fs. The stability of the 500 frames obtained was analysed using pytraj, and the stable poses were subjected to binding free energy calculations using the MM-GBSA approach 63 with MMPBSA.py in AmberTools20 with default parameters except for ionic strength, which was set to 0.1 M (istrng = 0.1). The individual terms were calculated according to Eqs. (1-4).
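The size of the production run follows from the stated time step and trajectory length. The sketch below works in integer femtoseconds to avoid rounding; it assumes that the 500 analysed frames are evenly spaced over the 1 µs production run, which the text does not state explicitly, and all names are illustrative.

```python
FS_PER_NS = 1_000_000
FS_PER_US = 1_000_000_000
TIME_STEP_FS = 4  # enabled by hydrogen mass repartitioning, as stated in the text


def n_steps(duration_fs, dt_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_fs."""
    if duration_fs % dt_fs:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // dt_fs


production_steps = n_steps(1 * FS_PER_US)

# Assumption: the 500 analysed frames are evenly spaced over the production run.
N_FRAMES = 500
steps_per_frame = production_steps // N_FRAMES
ns_per_frame = steps_per_frame * TIME_STEP_FS / FS_PER_NS

print(f"production steps : {production_steps:,}")
print(f"steps per frame  : {steps_per_frame:,}")
print(f"frame spacing    : {ns_per_frame} ns")
```

Under these assumptions, 1 µs corresponds to 250 million steps, with one frame saved every 500,000 steps (2 ns).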

Data availability
All data generated and analysed during this study are included in this published article (and its Supplementary Information files).